Abstract: Conventional MU-MIMO techniques, e.g. Linear 
Zero-Forced Beamforming (LZFB), require sufficiently accurate 
channel state information at the transmitter (CSIT) in order to 
realize spectrally efficient transmission (degree of freedom gains). 
In practical settings, however, CSIT accuracy can be limited by 
a number of issues including CSI estimation, CSI feedback delay 
between user terminals and base stations, and the time/frequency 
coherence of the channel. The latter aspects of CSIT-feedback 
delay and channel-dynamics can lead to significant challenges in 
the deployment of efficient MU-MIMO systems. 

Recently it has been shown by Maddah-Ali and Tse that 
degree of freedom gains can be realized by MU-MIMO even 
when the knowledge of CSIT is completely outdated. Specifically, 
the CSIT, albeit perfect, becomes known for each transmission 
only after it has taken place. This aspect of insensitivity to 
CSIT-feedback delay is of particular interest since it allows one 
to reconsider MU-MIMO design in dynamic channel conditions. 
Indeed, as we show, with appropriate scheduling, and even 
in the context of CSI estimation and feedback errors, the 
proposed schemes based on outdated CSIT can have performance 
advantages over conventional MU-MIMO in such scenarios. 

I. Introduction 

We consider a multiple-input multiple-output (MIMO) 
Gaussian broadcast channel modeling the downlink of a cel- 
lular system involving a base station (BS) with M antennas 
and L single-antenna user terminals (UT). A channel use of 
such a channel is described by 

$$y_k = \mathbf{h}_k^{\mathsf{H}} \mathbf{x} + v_k \tag{1}$$

where $y_k$ is the channel output at UT $k$, $v_k \sim \mathcal{CN}(0, N_0)$ is 
white Gaussian noise (WGN), $\mathbf{h}_k \in \mathbb{C}^{M \times 1}$ is the vector of 
channel coefficients from the antenna of the $k$-th UT to the BS 
antenna array, and $\mathbf{x}$ is the vector of channel input symbols 
transmitted by the BS. The channel input is subject to the 
average power constraint $E[\|\mathbf{x}\|^2] \le P$. 

We assume that the collection of all channel vectors $\mathbf{H} = 
[\mathbf{h}_1, \ldots, \mathbf{h}_L] \in \mathbb{C}^{M \times L}$ varies in time according to a block 
fading model, where $\mathbf{H}$ is constant over a slot of length $T$ 
channel uses (which we refer to as a coherence block), and 
evolves from slot to slot according to an ergodic stationary 
spatially white jointly Gaussian process, where the entries of 
$\mathbf{H}$ are i.i.d. Gaussian with elements $\sim \mathcal{CN}(0, 1)$. 
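
The block fading model above can be sketched numerically as follows. This is a minimal illustration, not part of the original development: the function names are hypothetical, and the slots are drawn independently of one another, which is a simplifying special case of the ergodic stationary process described in the text.

```python
import numpy as np

def block_fading_channels(M, L, num_slots, rng):
    """Draw one M x L channel matrix per coherence block, entries i.i.d. CN(0, 1).

    Assumption: slots are independent of each other (a special case of the
    ergodic stationary process in the text).
    """
    shape = (num_slots, M, L)
    real = rng.standard_normal(shape)
    imag = rng.standard_normal(shape)
    # Dividing by sqrt(2) gives unit variance per complex entry.
    return (real + 1j * imag) / np.sqrt(2.0)

def expand_to_channel_uses(H_slots, T):
    """Repeat each slot's matrix T times so H stays constant within a block."""
    return np.repeat(H_slots, T, axis=0)

rng = np.random.default_rng(0)
H_slots = block_fading_channels(M=2, L=4, num_slots=1000, rng=rng)
H_uses = expand_to_channel_uses(H_slots, T=5)
print(H_slots.shape, H_uses.shape)
print("empirical entry power:", np.mean(np.abs(H_slots) ** 2))
```

The first T entries of `H_uses` are identical copies of the first slot's matrix, which is what "constant over a coherence block" means in this sketch.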

If the CSI matrix $\mathbf{H}$ is perfectly and instantaneously known 
to the transmitter (CSIT) and the receivers (CSIR), the capacity 
region of the channel is obtained by MMSE-DFE 
beamforming and Gaussian dirty paper coding (DPC) [1]-[5]. 
In practice, however, neither CSIR nor CSIT is known 
perfectly. For example in frequency division duplex (FDD) 
systems, UTs estimate CSI based on downlink pilots which are 
received with additive noise, and the transmitter is provided 
with imperfect CSI via limited and delayed feedback from the 
UTs. Given the sensitivity of DPC to CSIT accuracy, it follows 
that schemes which are more robust to CSIT accuracy such as 
LZFB are the ones considered for actual deployments [6]. 

LZFB uses linear precoding to serve K out of L users 
simultaneously (with $K \le M$), and achieves for K = M 
the maximum possible degrees-of-freedom (DoFs) of M. 
Furthermore, it has been shown that even in the presence of 
estimation errors, if CSIT feedback is obtained in the same 
coherence block and the precoder is properly designed, the 
DoFs are still preserved, although there is a constant gap from 
the achievable rates under perfect CSIT [7]. 

In general, however, due to inherent feedback delay, the 
channels at the time of downlink pilot training differ from the 
channels at the time of actual data transmission. If such chan- 
nels are correlated with a correlation coefficient of magnitude 
less than one, the DoFs promised by the LZFB scheme are lost 
and achievable rates saturate with increasing SNR [7]. Inherent 
changes in channels over time, and practical limits on feedback 
delays in some systems, therefore create practical challenges 
even with CSIT-robust schemes. 

Maddah-Ali and Tse [8] have shown that even if the CSIT 
is completely outdated (i.e., the BS has perfect knowledge 
of past channels but no knowledge of the current channels), 
it is possible to acquire DoFs greater than 1 by means of 
transmission schemes that code across multiple quasistatic 
blocks. In particular, the Maddah-Ali and Tse (MAT) scheme 
[8] makes use of multi-round transmissions and applies the 
techniques of interference alignment (IA) to realize DoF gains. 
For example, with M = 2 antennas at the BS serving K = 2 
single antenna users, a DoF of 4/3 is achievable. In general 
the DoFs of such schemes scale as $M / \log_e M$, where M = K is the 
number of transmit antennas and simultaneously served users. 
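
To make the scaling concrete, the following sketch compares the exact DoF with the approximation $M/\log_e M$. It assumes the closed-form expression $K / (1 + 1/2 + \cdots + 1/K)$ for the K-user MAT scheme with M = K reported in [8]; that expression is not restated in this paper, and the function name `mat_dof` is only illustrative.

```python
from fractions import Fraction
import math

def mat_dof(K):
    """DoF of the K-user MAT scheme with M = K antennas.

    Assumption: uses the closed form K / (1 + 1/2 + ... + 1/K) from [8].
    """
    harmonic = sum(Fraction(1, r) for r in range(1, K + 1))
    return Fraction(K) / harmonic

for K in (2, 3, 4, 8, 16):
    exact = mat_dof(K)
    print(f"K={K:2d}  DoF={exact} ({float(exact):.3f})  K/ln K={K / math.log(K):.3f}")
```

For K = 2 and K = 3 this gives 4/3 and 18/11, matching the values quoted for the two-user and three-user schemes.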

In this paper we consider several practical aspects that arise 
in considering multi-round MU-MIMO schemes with outdated 
CSI. As a prelude, in Sec. II we present the system model of 
interest in this paper, along with a brief description of the MAT 
schemes from [8]. In Sec. III we study the effects of downlink 
training and CSI feedback on the achievable rates. For simplicity 
we focus on the two-user MAT scheme and show that, 
unlike conventional MU-MIMO, the achievable rates of these 
MU-MIMO schemes do not saturate with outdated CSI. 

In Sec. IV we develop methods for improving upon the 
achievable rates provided by outdated-CSI schemes by means 
of scheduling algorithms. With M transmit antennas, the DoFs 
are maximized via a K-user MAT scheme using K rounds 
with K = M. However, performing IA at every round gives 
rise to noise enhancement. Thus, with an increasing number of 
rounds a higher signal to noise ratio (SNR) is required for DoF 
gains to materialize. As we show, by leveraging gains obtained 
through scheduling, multi-round MU-MIMO schemes, e.g., 
3-round 3-user schemes, can be made operationally attractive 
even at lower SNR. Obtaining such scheduling benefits requires 
the use of novel packet-centric IA MU-MIMO schemes, which 
exploit the same principles as the MAT scheme, but provide 
significantly more flexibility in scheduling. Simulation examples 
are provided in Section V and finally, a summary with 
conclusions is given in Section VI. 

II. System Model and Introduction to MAT 

Throughout, we assume the presence of a sequence of 
quasistatic channels between the n-th user and the transmitter. 
We define $\mathbf{h}_n[t]$ as the $1 \times M$ channel between the transmitter 
and the n-th user over the t-th quasistatic interval, which is hereafter 
referred to as the t-th slot. We assume that the user channel 
coefficients corresponding to different users, or different slots, 
are mutually independent. We let $\mathbf{x}[t]$ denote the vector signal 
transmitted within slot t. The received signal of the n-th user 
at time t is given by 

$$y_n[t] = z_n[t] + v_n[t]$$

where $v_n[t]$ is $\mathcal{CN}(0, 1)$, i.i.d. in n and t, and where 

$$z_n[t] = \mathbf{h}_n^{\mathsf{H}}[t]\, \mathbf{x}[t]$$

with $\mathbf{h}_n[t]$ denoting the vector channel between the transmitter 
and user n in slot t. In all the succeeding sections, we 
assume that a subset of K out of L users are scheduled for 
simultaneous transmission with K = M. 

A. MAT Scheme with M=K=2 

Next we give a brief description of the two-user MAT 
scheme [8]. The scheme requires M = 2 BS antennas and 
serves K = 2 users with a two-round scheme. The first round 
uses two slots, each used by the BS to transmit a message 
intended for a single user. The second round uses a single slot 
and contains a message simultaneously useful to both users. In 
particular, in round-1 in slot t = j with j = 1, 2, the BS sends 
a $2 \times 1$ vector symbol $\mathbf{x}_j$ intended for user j (i.e., $\mathbf{x}[j] = \mathbf{x}_j$ 
for j = 1, 2). As a result, user n with n = 1, 2 receives the 
following observations within slots with j = 1, 2: 

$$y_n[j] = \mathbf{h}_n^{\mathsf{H}}[j]\, \mathbf{x}_j + v_n[j], \quad j = 1, 2 \tag{2}$$



After round-1 the n-th user has one scalar observation of its 
intended message. It also has one scalar observation of the 
message intended for the other user, for which it is simply an 
eavesdropper. The second round transmission occurs within 
a third slot labeled slot t = 3, and consists of a message 
simultaneously useful to both users. In particular, the BS forms 
a new scalar symbol (stream) that equals the sum of the two 
users' scalar eavesdropped observations: 

$$x_{1,2} = \mathbf{h}_1^{\mathsf{H}}[2]\, \mathbf{x}_2 + \mathbf{h}_2^{\mathsf{H}}[1]\, \mathbf{x}_1 \tag{3}$$



The BS transmits the scalar over one linear dimension, 
e.g. over antenna 1. User n obtains the following observation: 

$$y_n[3] = \alpha\, h_n[3]\, x_{1,2} + v_n[3] \tag{4}$$

where $h_n[3]$ denotes the scalar channel between transmit 
antenna 1 and user n in slot 3. The parameter $\alpha = 1/\sqrt{2}$ ensures 
that the average power constraint is satisfied. 

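Using the observations from the three slots, and after 
canceling out interference, each user sees an equivalent $2 \times 2$ 
channel and can thus decode its own message. For example, 
user 1 obtains 

$$\begin{bmatrix} y_1[1] \\ y_1[3] - \alpha h_1[3]\, y_1[2] \end{bmatrix} = \begin{bmatrix} \mathbf{h}_1^{\mathsf{H}}[1] \\ \alpha h_1[3]\, \mathbf{h}_2^{\mathsf{H}}[1] \end{bmatrix} \mathbf{x}_1 + \begin{bmatrix} v_1[1] \\ \tilde{v}_1[2] \end{bmatrix} \tag{5}$$

whereby $\tilde{v}_1[2] = v_1[3] - \alpha h_1[3]\, v_1[2]$. Thus, each user is able 
to decode two symbols over 3 slots, yielding DoF = 4/3. 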
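
The two-round procedure can be checked with a small noiseless simulation. The sketch below is illustrative only: it sets all noise terms to zero, uses 0-based indices for users and slots, and the helper names (`cn`, `decode`) are hypothetical. Each user subtracts its own eavesdropped observation from the slot-3 signal and then inverts the resulting equivalent 2x2 channel.

```python
import numpy as np

rng = np.random.default_rng(1)

def cn(*shape):
    """i.i.d. CN(0, 1) samples of the given shape."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

h = cn(2, 2, 2)   # h[n, t]: channel vector of user n in round-1 slot t (0-based)
g = cn(2)         # g[n]: scalar channel from antenna 1 to user n in the third slot
x = cn(2, 2)      # x[n]: 2 x 1 symbol intended for user n, sent in slot n
alpha = 1.0 / np.sqrt(2.0)

# Round 1 (noiseless): y[n, t] = h[n, t]^H x[t]; np.vdot conjugates its first argument.
y = np.array([[np.vdot(h[n, t], x[t]) for t in range(2)] for n in range(2)])

# Round 2: sum of the two eavesdropped observations, sent over antenna 1.
x12 = y[0, 1] + y[1, 0]
y3 = alpha * g * x12          # y3[n]: third-slot observation of user n

def decode(n):
    """Recover x[n] at user n from its three (noiseless) observations."""
    other = 1 - n
    # Equivalent channel: own round-1 channel, and the other user's
    # eavesdropper channel scaled by the third-slot channel.
    H_eq = np.vstack([h[n, n].conj(), alpha * g[n] * h[other, n].conj()])
    rhs = np.array([y[n, n], y3[n] - alpha * g[n] * y[n, other]])
    return np.linalg.solve(H_eq, rhs)

print(np.allclose(decode(0), x[0]), np.allclose(decode(1), x[1]))
```

With noise present, the same equivalent channel applies but the second row carries the combined noise term, which is the source of the noise enhancement discussed later.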

We note that, in order to enable the round-2 (third-slot) 
transmission, the transmitter needs to have available the round-1 
eavesdropper channels, $\mathbf{h}_2[1]$ and $\mathbf{h}_1[2]$. Therefore it is 
assumed that the third slot (associated with coherence block 
3) occurs sufficiently later than the first two slots, to allow 
users 1 and 2 to feed their CSI back to the transmitter. Furthermore, 
implicit in the DoF calculations is the assumption that 
the intended user of message i also has available to it the CSI 
seen by the eavesdropper during the round-1 transmission of 
message i. That is, user 1 has available $\mathbf{h}_2[1]$, i.e., user 2's 
channel during slot 1, and user 2 has available $\mathbf{h}_1[2]$, i.e., user 
1's channel during slot 2. Hence, to enable this MAT scheme, 
each eavesdropper channel needs to be communicated by the 
appropriate eavesdropper both to the BS (in order to enable 
the MAT scheme transmissions) and to the intended receiver 
of the eavesdropped transmission. 

B. Brief Description of 3 -User MAT Schemes 

The 3-user MAT schemes [8] build upon the principles of 
the 2-user MAT scheme. They require at least three antennas 
at the BS to serve 3 users in a multi-round transmission, and 
can be operated with either 2 or 3 rounds. 

The 2-round 3-user MAT scheme uses 3 slots in the first 
round and 3 slots in the second round. In the first round, slot 
j for j = 1, 2, 3 is used to transmit a 3-dimensional symbol $\mathbf{x}_j$ 
to user j. After round-1, each user has one scalar observation 
of its own message and two scalar eavesdropped observations. 
In the second round, 3 scalar messages of the form $x_{i,j}$ are 
formed at the BS: $x_{i,j}$ is the sum of the (eavesdropped) 
observations collected by users i and j from the round-1 slot 
transmission of each other's message. The BS then uses a 
slot to transmit each such degree-2 message. Using its round-2 
observations, each user can then strip out the two round-1 
eavesdropper observations of its own message. This, together 
with the user's own round-1 observation of its message, allows 
the user to decode its message. This scheme yields 9 symbols 
over 6 slots and thus a DoF of 1.5. 

The 3-round scheme uses 6 slots in round-1, 3 slots in 
round-2, and 2 slots in round-3. In round-1 the BS uses 
2 slots per user to transmit two 3-dimensional messages to 
each user. Thus each user i then has two (round-1) eavesdropped 
scalar observations for messages intended for user j, 
$j \ne i$. As a result, the degree-2 messages formed after round-1 
at the BS, of the form $\mathbf{x}_{i,j}$, are two-dimensional. Each such 
degree-2 message, i.e., $\mathbf{x}_{i,j}$, is transmitted once over a round-2 
slot (total of 3 round-2 slots). 

After round-2, an intended recipient of message $\mathbf{x}_{i,j}$, e.g., 
user i, now requires a single round-2 eavesdropper observation 
to decode this message. This observation is made available 
to user i in round 3. In particular, the three round-2 scalar 
eavesdropper observations (one per user or per round-2 slot) 
are used to generate a single 3-dimensional degree-3 message. 
The BS uses two round-3 slots to transmit two linear combinations 
of this message (round 3). Based on its two round-3 
observations, each user can decode the 2 scalar elements (out 
of the 3 in the degree-3 message) that it does not have. This 
allows user j to also decode the degree-2 messages intended 
for the user and in turn decode its own 6-dimensional message. 
This scheme yields 18 symbols over 6 + 3 + 2 = 11 slots and 
thus DoF = 18/11 ≈ 1.636. Note that the three-round scheme 
does have a higher DoF than the two-round scheme. However, 
the difference is small. 

C. Brief Description of K -User MAT Schemes 

The K-user K-round MAT scheme [8] uses a BS with at 
least K transmit antennas to simultaneously serve K single-antenna 
users by means of K rounds of transmissions. The 
first round consists of Q slots, where Q is some properly 
chosen integer.¹ Round r of the protocol comprises Q/r 
slots: based on CSIT from round r - 1, the BS generates 
Q/r degree-r messages and transmits them over Q/r slots. 
Note that the CSIT required from a round-r message, i.e., 
a message simultaneously useful to r users, is the CSI of 
all K - r eavesdropping users during the transmission slot 
of that message. This is then used at the BS to regenerate 
the eavesdropper observations (without the noise) and in turn, 
degree-(r + 1) messages for round r + 1. 

Although the K-round scheme results in the maximum 
DoFs, schemes with R rounds, $2 \le R < K$, are also attractive 
(as seen in the K = 3 examples). The first R - 1 rounds 
of an R-round scheme are identical to those of the K-round 
scheme. The last (R-th) round in this case, however, consists of 
Q(K + 1 - R)/R transmissions of scalar degree-R messages. 
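
The slot counts above translate directly into a DoF figure. The following sketch (function names are illustrative, not from the source) tallies the slots per round for an R-round K-user scheme and divides the K·Q symbols sent in round 1 by the total number of slots.

```python
from fractions import Fraction

def mat_slots_per_round(K, R, Q):
    """Slots used in rounds 1..R of an R-round K-user scheme with Q round-1 slots."""
    slots = [Fraction(Q, r) for r in range(1, R)]   # rounds 1 .. R-1 use Q/r slots
    slots.append(Fraction(Q * (K + 1 - R), R))      # final round R
    if any(s.denominator != 1 for s in slots):
        raise ValueError("Q must make every round an integer number of slots")
    return [int(s) for s in slots]

def scheme_dof(K, R, Q):
    """Each round-1 slot carries a K-dimensional symbol, so K*Q symbols in total."""
    return Fraction(K * Q, sum(mat_slots_per_round(K, R, Q)))

for K, R, Q in [(2, 2, 2), (3, 2, 3), (3, 3, 6)]:
    print(K, R, mat_slots_per_round(K, R, Q), scheme_dof(K, R, Q))
```

The three cases reproduce the 3-slot two-user scheme (DoF 4/3), the 6-slot 2-round 3-user scheme (DoF 3/2), and the 11-slot 3-round 3-user scheme (DoF 18/11).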

III. Achievable Rates with Training 

1 The value of Q is a multiple of K!, i.e., it is such that the number of 
transmissions required in each round for each degree-r message intended for 
each user r-tuple is an integer. 

In this section, we analyze the performance of the MAT 
scheme taking into account aspects of training and feedback. 
The analysis is based on immediate extensions of the approach 
in [7]. For simplicity, we focus on the MAT scheme for 
M = K = 2 users. The case M > 2 can be handled with 
straightforward, albeit tedious, extensions of the M = 2 case. 
The 2-user MAT scheme requires the following: 

A. Downlink training (per slot): This allows each user to 
estimate its channel in any given slot. 

B. Channel state feedback: This allows eavesdropper chan- 
nel CSI of the respective UT in any given slot to be made 
available to the BS and the intended (other) receiver. 

C. Data transmission and decoding: This includes: the 
round- 1 slot transmissions; generation and transmission 
of the round-2 messages; and decoding at each user. 

A. Downlink Training 

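In order to enable channel estimation in round-1 slots, $\beta_1 M$ 
shared pilots ($\beta_1 \ge 1$ symbols per antenna) are transmitted in 
the downlink in each slot. UT k for $k \in \{1, 2\}$ estimates its 
slot-j channel from the observation 

$$\mathbf{s}_k[j] = \sqrt{\beta_1 P}\, \mathbf{h}_k[j] + \mathbf{v}_k[j] \tag{6}$$

where $\mathbf{v}_k[j] \sim \mathcal{CN}(\mathbf{0}, N_0 \mathbf{I})$. The MMSE estimate of user 
k's channel in slot j is given as 

$$\hat{\mathbf{h}}_k[j] = E\big[\mathbf{h}_k[j]\, \mathbf{s}_k^{\mathsf{H}}[j]\big]\, E\big[\mathbf{s}_k[j]\, \mathbf{s}_k^{\mathsf{H}}[j]\big]^{-1} \mathbf{s}_k[j] = \frac{\sqrt{\beta_1 P}}{N_0 + \beta_1 P}\, \mathbf{s}_k[j] \tag{7}$$

Note that $\mathbf{h}_k[j]$ can be written in terms of the estimate $\hat{\mathbf{h}}_k[j]$ 
and independent white Gaussian noise $\mathbf{n}_k[j]$ as [7]: 

$$\mathbf{h}_k[j] = \hat{\mathbf{h}}_k[j] + \mathbf{n}_k[j] \tag{8}$$

where $\mathbf{n}_k[j]$ is Gaussian with covariance: 

$$E\big[\mathbf{n}_k[j]\, \mathbf{n}_k^{\mathsf{H}}[j]\big] = \sigma_1^2 \mathbf{I}, \quad \text{with } \sigma_1^2 = \frac{1}{1 + \beta_1 P / N_0} \tag{9}$$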
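
The error variance of the training-based estimate can be verified by simulation. The sketch below is a Monte Carlo check under stated assumptions: unit-variance complex Gaussian channel entries, training noise of variance $N_0$ per entry, and the standard linear MMSE scaling $\sqrt{\beta_1 P}/(N_0 + \beta_1 P)$ applied to the pilot observation. The function name and parameter values are illustrative.

```python
import numpy as np

def training_mmse(h, beta1, P, N0, rng):
    """Form the pilot observation s = sqrt(beta1*P) h + v and its MMSE estimate of h."""
    noise = np.sqrt(N0 / 2.0) * (
        rng.standard_normal(h.shape) + 1j * rng.standard_normal(h.shape)
    )
    s = np.sqrt(beta1 * P) * h + noise
    return np.sqrt(beta1 * P) / (N0 + beta1 * P) * s

rng = np.random.default_rng(2)
M, trials = 2, 200_000
beta1, P, N0 = 1, 10.0, 1.0

# True channels with i.i.d. CN(0, 1) entries.
h = (rng.standard_normal((trials, M)) + 1j * rng.standard_normal((trials, M))) / np.sqrt(2.0)
h_hat = training_mmse(h, beta1, P, N0, rng)

empirical = np.mean(np.abs(h - h_hat) ** 2)      # per-entry error variance
predicted = 1.0 / (1.0 + beta1 * P / N0)         # sigma_1^2
print(f"empirical={empirical:.4f}  predicted={predicted:.4f}")
```

Increasing `beta1` or `P` in this sketch drives both numbers toward zero, reflecting that the training error vanishes at high SNR.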

B. Channel State Feedback 

To enable the round-2 transmission, each user has to feed 
back to the BS its own channel seen during the round-1 slot 
for which it was an eavesdropper. This channel also needs to 
be communicated to the intended user of the message (i.e., the 
other user). We use $\tilde{\mathbf{H}} = [\tilde{\mathbf{h}}_1[2], \tilde{\mathbf{h}}_2[1]] \in \mathbb{C}^{2 \times 2}$ to denote the 
imperfect eavesdropper CSI available at the BS corresponding 
to the true channel $\mathbf{H} = [\mathbf{h}_1[2], \mathbf{h}_2[1]]$. 

This dual training for CSIT and CSIR can be accomplished 
in various ways. In this paper we assume that it is accomplished 
by letting users take turns in time (in a round-robin 
fashion) to feed back their CSI to the BS. When a particular 
user is transmitting, all other users are silent and thus other 
UTs can also receive the CSI feedback. This is best suited 
to situations when the users are sufficiently closely located, 
i.e., the channel between the user sending the feedback and 
the user listening is very strong.² 

We assume analog feedback and make the simplifying 
assumption that the feedback channel is unfaded AWGN, with 
the same SNR as the downlink, $P/N_0$, and that the UTs make use 
of orthogonal signaling. The number of feedback symbols per 
antenna is given by $\beta_f$. 

2 The general problem of efficient CSIT and CSIR dissemination is beyond 
the scope of this paper. 

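Recall that each UT receives $\mathbf{s}_k[j] = \sqrt{\beta_1 P}\, \mathbf{h}_k[j] + \mathbf{v}_k[j]$ 
during the downlink training phase. Then, each UT transmits 
a scaled version of $\mathbf{s}_k[j]$ during the channel feedback phase 
and the resulting observation at the BS is given by 

$$\mathbf{g}_{\mathrm{BS},k}[j] = \frac{\sqrt{\beta_f P}}{\sqrt{\beta_1 P + N_0}}\, \mathbf{s}_k[j] + \mathbf{w}_k[j] = \frac{\sqrt{\beta_f \beta_1}\, P}{\sqrt{\beta_1 P + N_0}}\, \mathbf{h}_k[j] + \frac{\sqrt{\beta_f P}}{\sqrt{\beta_1 P + N_0}}\, \mathbf{v}_k[j] + \mathbf{w}_k[j] \tag{10}$$

where $\mathbf{w}_k[j]$ represents the AWGN noise on the uplink feedback 
channel (variance $N_0$) and $\mathbf{v}_k[j]$ is the noise during the downlink 
training phase. Following the analysis of [7], we can write 
$\mathbf{h}_k[j]$ in terms of the BS estimate $\tilde{\mathbf{h}}_k[j]$ as follows: 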



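$$\mathbf{h}_k[j] = \tilde{\mathbf{h}}_k[j] + \mathbf{e}_k[j] \tag{11}$$

where $\tilde{\mathbf{h}}_k[j]$ and $\mathbf{e}_k[j]$ are mutually independent and $\mathbf{e}_k[j]$ has 
Gaussian i.i.d. components with zero mean and variance: 

$$\sigma_e^2 = \frac{\sigma_w^2}{\sigma_w^2 + \dfrac{\beta_f \beta_1 P^2}{\beta_1 P + N_0}} \tag{12}$$

with $\sigma_w^2 = N_0 \left(1 + \dfrac{\beta_f P / N_0}{1 + \beta_1 P / N_0}\right)$. 
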

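We assume that the feedback channel between users has a 
different SNR, given by $P_1/N_0$, which quantifies the strength 
of the channel (for example, if the users are close, $P_1 \gg P$). 
Proceeding on similar grounds, the channel vector $\mathbf{h}_k[j]$ can be 
written in terms of its MMSE estimate $\check{\mathbf{h}}_k[j]$ at the other user as 

$$\mathbf{h}_k[j] = \check{\mathbf{h}}_k[j] + \mathbf{f}_k[j] \tag{13}$$

where $\check{\mathbf{h}}_k[j]$ and $\mathbf{f}_k[j]$ are mutually independent and $\mathbf{f}_k[j]$ has 
Gaussian i.i.d. components with zero mean and variance: 

$$\sigma_f^2 = \frac{\sigma_z^2}{\sigma_z^2 + \dfrac{\beta_f \beta_1 P_1 P}{\beta_1 P + N_0}} \tag{14}$$

with $\sigma_z^2 = N_0 \left(1 + \dfrac{\beta_f P_1 / N_0}{1 + \beta_1 P / N_0}\right)$. 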
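
The combined effect of training noise and analog feedback noise can be checked with a short simulation. The sketch below is an assumption-laden illustration rather than the paper's derivation: it models the feedback as the pilot observation rescaled to power $\beta_f P_{\text{link}}$ per coefficient plus AWGN of variance $N_0$, applies a linear MMSE estimator, and compares the empirical error variance with the closed form that follows from that model. `P_link` stands for P on the UT-to-BS link and for $P_1$ on the UT-to-UT link; all names are hypothetical.

```python
import numpy as np

def cn(rng, shape, var=1.0):
    """i.i.d. circularly symmetric complex Gaussian samples with the given variance."""
    scale = np.sqrt(var / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def feedback_error_variance(beta1, betaf, P, P_link, N0):
    """Linear MMSE error variance per coefficient after training plus analog feedback."""
    gain2 = betaf * beta1 * P_link * P / (beta1 * P + N0)    # squared gain on h
    noise = N0 * (1.0 + betaf * P_link / (beta1 * P + N0))   # effective noise variance
    return noise / (noise + gain2)

def simulate(beta1, betaf, P, P_link, N0, trials, rng):
    h = cn(rng, trials)
    s = np.sqrt(beta1 * P) * h + cn(rng, trials, N0)                           # training
    g = np.sqrt(betaf * P_link / (beta1 * P + N0)) * s + cn(rng, trials, N0)   # feedback
    a = np.sqrt(betaf * beta1 * P_link * P / (beta1 * P + N0))
    noise = N0 * (1.0 + betaf * P_link / (beta1 * P + N0))
    h_est = a / (a ** 2 + noise) * g                                           # LMMSE
    return np.mean(np.abs(h - h_est) ** 2)

rng = np.random.default_rng(3)
for P_link in (10.0, 100.0):   # e.g. UT-to-BS link, then a stronger UT-to-UT link
    emp = simulate(1, 1, 10.0, P_link, 1.0, 200_000, rng)
    pred = feedback_error_variance(1, 1, 10.0, P_link, 1.0)
    print(f"P_link={P_link:6.1f}  empirical={emp:.4f}  predicted={pred:.4f}")
```

As `P_link` grows, the error variance approaches the training-only value $1/(1 + \beta_1 P/N_0)$, which matches the intuition that a very strong inter-user link adds little extra distortion.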

C. Data Transmission 

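In the data-transmission portion of each slot, the BS transmits 
messages that comprise coded data symbols. Each such 
message is received by both receivers. Without loss of generality 
we focus on user 1. Using (2) and (8), the observations 
of user 1 in slots j = 1, 2 can be expressed as follows: 

$$y_1[j] = \hat{\mathbf{h}}_1^{\mathsf{H}}[j]\, \mathbf{x}_j + \mathbf{n}_1^{\mathsf{H}}[j]\, \mathbf{x}_j + v_1[j], \quad j = 1, 2 \tag{15}$$

The BS then uses its round-1 CSIT, $\tilde{\mathbf{h}}_1[2]$ and $\tilde{\mathbf{h}}_2[1]$, to form the scalar message 
$(\tilde{\mathbf{h}}_1^{\mathsf{H}}[2]\, \mathbf{x}_2 + \tilde{\mathbf{h}}_2^{\mathsf{H}}[1]\, \mathbf{x}_1)$ and transmits it in the third slot. The 
resulting user-1 observation (4) can be expressed as 

$$y_1[3] = \alpha \hat{h}_1[3] \{\tilde{\mathbf{h}}_1^{\mathsf{H}}[2]\, \mathbf{x}_2 + \tilde{\mathbf{h}}_2^{\mathsf{H}}[1]\, \mathbf{x}_1\} + \alpha n_1[3] \{\tilde{\mathbf{h}}_1^{\mathsf{H}}[2]\, \mathbf{x}_2 + \tilde{\mathbf{h}}_2^{\mathsf{H}}[1]\, \mathbf{x}_1\} + v_1[3] \tag{16}$$

where $\alpha$ is a power normalization³ which ensures that the 
transmitted symbols satisfy the average power constraint.⁴ 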

User 1 has the estimates h 2 [1] and hi [2] of the true channels 
h 2 [l] and hi [2]. It therefore needs to compute the MMSE 
estimates of hi [2] given hi [2], and of h 2 [l] given h 2 [l]. 
Applying the results of MMSE estimation theory, we obtain: 



where 



hi[2]=hi[2]+Ci[2] 

h 2 [i] = h 2 [i] + c 2 [i] 

h 1 [2]=E[h 1 [2]|h 1 [2]] 
h 2 [l] =E[h 2 [l]|h 2 [l]] 



7 hi [2] 
7h 2 [l] 



(17) 



(18) 



ff^v and Ci[2] and £ 2 [1] have i.i.d. components 



with 7 -- Tf 

with variances given by a\ and o\ respectively. 
pfP + No^P + No 
f P fcP 



/3 f P + N 0/ 

fifP ftPi 



(19) 



PfP+No PiP+N V (3 f P+N p f P 1 +N 0/ 
Using (17) and (18), we can re-express (16) as follows: 

2/i [3] = a/n[3]{h![2] H x 2 + h 2 [l] H Xl } + 
an 1 [3]{h 1 [2] H x 2 + h 2 [l] H x 1 } + 
a(Jn [3] + m [3]) (Ci [2] H x 2 + C 2 [1] H xi ) + Vl [3] (20) 

The effective output for user 1 (after cancelation of the 
undesired signal) is then given by 



2/1 [1] 

yi[3]-a7fti[3]yi[2] 



Xl 



where 
B = 



Bx 2 + IiXi + I 2 x 2 







hi[l] H 
afti[3]h 2 [l] H 

«i[l] 

vi[3] - ahi[3]vi[2] 



Ii 



(21) 

(22) 
(23) 
(24) 



^ 1 [3](h 1 [2] H - 7 hi[2] H ) 
ni [l] H 

an 1 [3](h 2 [l] H +C 2 [l] H ) + ^ 1 [3]C 2 [l] H 


n 1 [3](h 1 [2] H +Ci[2] H ) + ft 1 [3](Ci[2] H - 7 ni[2] H ). 

D. Achievable Rate Bounds 

Bounds on the achievable rates that can be obtained with 
the MAT scheme on the basis of downlink training and analog 
feedback can be readily derived. We next derive a lower bound 
on the mutual information of user 1, denoted by $\mathcal{R}_1$, assuming 
Gaussian inputs, i.e., $\mathbf{x}_k \sim \mathcal{CN}(0, P/M)$. From (21) we have 

$$\mathbf{y} = \mathbf{A} \mathbf{x}_1 + \mathbf{B} \mathbf{x}_2 + \mathbf{I}_1 \mathbf{x}_1 + \mathbf{I}_2 \mathbf{x}_2 + \mathbf{v} \tag{25}$$



channels whose elements have power < 1. However, for sufficiently high SNR 
or large $\beta$, the expected power is close to 1 with $\alpha$ close to $1/\sqrt{2}$. 

4 In this phase, the BS sends pilot symbols to train the user channels $h_1[3]$ 
using only one transmit antenna. Thus $h_1[3]$ is a scalar term. 



The achievable rate with Gaussian inputs and CSI training and 
feedback is lower bounded by 

H x > ?E[log|N M AT + (AA H +BB H +I y i + I B )P/M| 



- log |N MAT + (BB H +I A + I B )P/M\] (26) 



where 
A = 



hi[l] H 
a/ii[3]h 2 [l] H 



Nmat= 



1/ 



N 
iV (l + |a 7 /n[3]| 2 ) 

Mg\ 

a 2 {a\{Mal + ||h 2 [l]|| 2 ) + M|/n[3]|V b 2 )) 


aHaUMa^h^+M^lHalW^l 



The proof of the above result follows from [7]. Bounds on 
rates for user-2 follow and are the same as user-1. 

IV. Scheduling 

The scheduling setting we consider involves an M-antenna 
transmitter and L single antenna users. We let $\mathbf{x}_m(i)$ denote 
the i-th (coded) message intended for user m. We also let 
$t_m(i)$ denote the index of the (first-round) slot over which 
message $\mathbf{x}_m(i)$ is transmitted, i.e., $\mathbf{x}[t_m(i)] = \mathbf{x}_m(i)$. 

We consider scheduling algorithms in the family derived via 
stochastic optimization using the Lyapunov drift technique [9], 
according to which, at each scheduling slot t, the scheduler 
updates "weights" for each user and solves a max-weight sum-rate 
maximization problem. The weights can be interpreted as 
the backlog of some appropriately designed "virtual queues," 
that play the role of stochastic versions of Lagrangian multipliers 
in the associated network utility function maximization 
problem. The scheduling decision in slot t exploits knowledge 
of the transmitter-user channels over all slots $\tau$ with $\tau < t$. For 
simplicity, we assume that, for each past transmission, 
CSIT is available for scheduling from all L users. Also, we 
focus on the case where all users have the same SNR and the 
scheduling criterion is the expected sum user-rate.⁵ 

In order to appreciate the potential challenges and benefits 
of scheduling for MU-MIMO with outdated CSI, it is worth 
contrasting it against scheduling for conventional MU-MIMO. 
In conventional MU-MIMO, CSIT is collected about the 
channels between the transmitter and multiple users. The 
scheduler at the transmitter uses this CSIT to select a subset 
of users for MU-MIMO transmission along with a precoder. 
The assumption with scheduling conventional MU-MIMO is 
that the channels based on which CSIT is obtained and the 
channels over which the MU-MIMO transmission takes place 
are sufficiently correlated (they differ by an error with a 
sufficiently small variance) [7]. 

Much like with conventional MU-MIMO, CSI from multiple 
users can be exploited to schedule joint MU-MIMO 
transmissions with outdated CSI to optimize some system 
utility metric.⁶ The key difference here is that these are multi-round 
schemes, whereby the MU-MIMO transmissions at a 
given round are "joint" transmissions of several eavesdropped 
messages from the previous rounds, and thus only exploit 
CSIT from past rounds. Furthermore, as all CSIT available 
is from past transmission slots only, the exact timing of the 
scheduled transmissions does not matter. 

We first consider MAT session schedulers, i.e., schedulers 
that schedule packets from different sets of users into MAT 
sessions. We then consider a class of schedulers that schedule 
multi-round MU-MIMO transmissions based on outdated CSI 
in a more flexible manner. 

A. MAT-Session Scheduling 

We first consider the 2-user MAT (MAT-2) session scheduling 
problem in detail. We then briefly comment on extensions 
for the 2-round K-user problem and then the R-round K-user 
scheduling sessions, with $R \le K$, $K \le M$ and $K \le L$. 

The 2-user MAT-session scheduler schedules pairs of user 
packets of the form $(\mathbf{x}_m(i), \mathbf{x}_n(j))$ with $m \ne n$ in two-round 
MAT sessions. Note that, since the round-1 transmissions 
involve individual user messages, the pairing decisions need 
only occur just prior to the second round transmission. Pairing 
involves the sum of the eavesdropped observations from first-round 
transmissions. 

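Given a MAT session between the i-th packet of user m 
and the j-th packet of user n, its round-2 slot is denoted by 
$t_{m,n}(i,j)$, and satisfies $t_{m,n}(i,j) > \max\{t_m(i), t_n(j)\}$. The 
associated transmitted signal is given by 

$$\mathbf{x}[t_{m,n}(i,j)] = \begin{bmatrix} \alpha_{m,n}(i,j) \big( \mathbf{h}_n^{\mathsf{H}}[t_m(i)]\, \mathbf{x}_m(i) + \mathbf{h}_m^{\mathsf{H}}[t_n(j)]\, \mathbf{x}_n(j) \big) \\ 0 \end{bmatrix}$$

whereby the scaling constant $\alpha_{m,n}(i,j)$ is chosen so as to 
ensure constant power transmission, i.e., 

$$\alpha_{m,n}(i,j) = \frac{\sqrt{2}}{\sqrt{\|\mathbf{h}_n[t_m(i)]\|^2 + \|\mathbf{h}_m[t_n(j)]\|^2}}$$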
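
The role of the scaling constant can be illustrated numerically. The sketch below assumes M = 2 and Gaussian inputs with per-antenna power P/2 (the input model used in Sec. III), fixes a pair of eavesdropper channels, and checks that the scaled round-2 scalar has average power P regardless of the channel norms. The helper names are hypothetical.

```python
import numpy as np

def round2_scaling(h_n_tm, h_m_tn):
    """Scaling constant sqrt(2) / sqrt(||h_n[t_m(i)]||^2 + ||h_m[t_n(j)]||^2)."""
    total = np.linalg.norm(h_n_tm) ** 2 + np.linalg.norm(h_m_tn) ** 2
    return np.sqrt(2.0) / np.sqrt(total)

rng = np.random.default_rng(4)

def cn(shape, var=1.0):
    """i.i.d. complex Gaussian samples with the given variance."""
    scale = np.sqrt(var / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

P, trials = 10.0, 200_000
h_n_tm, h_m_tn = cn(2), cn(2)            # fixed eavesdropper channels (known CSIT)
x_m = cn((trials, 2), P / 2.0)           # many draws of the packet for user m
x_n = cn((trials, 2), P / 2.0)           # many draws of the packet for user n

alpha = round2_scaling(h_n_tm, h_m_tn)
# Row-wise h^H x: multiply each symbol vector by the conjugated channel.
tx = alpha * (x_m @ h_n_tm.conj() + x_n @ h_m_tn.conj())
avg_power = np.mean(np.abs(tx) ** 2)
print(f"average transmit power = {avg_power:.3f} (target P = {P})")
```

Unlike the fixed normalization $1/\sqrt{2}$ of the basic two-user scheme, this channel-dependent scaling keeps the transmit power constant for every scheduled pair rather than only on average over channels.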
For convenience, we focus on a fixed-buffer size scheduler. 
In particular, we assume that at each scheduling instance (i.e., 
each time a round-2 transmission is to be scheduled) the 
scheduler has available CSI from all L users on LN round-1 
slots, and exactly N of these slots carried messages for a 
given user. Once a round-2 transmission is scheduled between 
some packet i of some user m and some packet j of some user 
n, this transmission is also accompanied by two new round-1 
transmissions of fresh packets to users m and n. 

The optimal scheduling algorithm in this case is then 
straightforward to derive. At any given scheduling instance, 
the scheduler has CSIT for packets $\mathbf{x}_m(i)$ for $1 \le i \le N$ and 
$1 \le m \le L$ (without loss of generality the packet indices of 
each user are indexed from 1 to N). The scheduling problem 
reduces to the following optimization [9] 

$$(m^*, i^*, n^*, j^*) = \arg\max_{\substack{(m,i,n,j): \\ 1 \le m < n \le L,\ 1 \le i,j \le N}} Q_m\, \Delta R_{m,i}(n,j) + Q_n\, \Delta R_{n,j}(m,i) \tag{27}$$



5 The general unequal SNR case with a general system-wide utility metric 
can be similarly captured with appropriate extensions [9]. 



6 Scheduling requires the CSI of a UT in slots for which it receives its 
intended message. This is an added requirement over the basic MAT scheme. 



where $Q_m$ denotes the optimization weight⁷ of user m, 
and where $\Delta R_{m,i}(n,j)$ is the expected mutual-information 
increase to user m by performing a round-2 transmission that 
completes an existing (in progress) MAT-2 session between 
$\mathbf{x}_m(i)$ and $\mathbf{x}_n(j)$. This expected increase is given by 




AR mA (nJ) = R m ,i(nJ) - R n 



where 



Rm t i(nJ) = % log(det[/ + Kj 1 H m>n H^ l>n P/2]) 

is the expected mutual information provided by the MAT 
session to user m after performing interference alignment, 



with H r 



h m [t m (i)] 

V2f 



K7 1 



N 

2i/r 



h n [t m (i)] 



^(l+iihH^^jip + Hh^^yjjip) 



(28) 



The quantity 

$$R_{m,i} = \log\left(1 + \frac{P}{2 N_0}\, \|\mathbf{h}_m[t_m(i)]\|^2\right)$$

is the mutual information from the round-1 transmission of 
packet i, and $\bar{R}_m$ is the expected mutual information from a 
round-1 transmission of a new packet for user m. 
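
The max-weight selection in (27) amounts to an exhaustive search over user pairs and buffered packets. The sketch below is a brute-force illustration under stated assumptions: the expected rate increases are supplied as a precomputed array `dR[m, i, n, j]` (how they are computed is outside this sketch), indices are 0-based, and the function name `best_pair` is hypothetical.

```python
import itertools
import numpy as np

def best_pair(Q, dR):
    """Return the (m, i, n, j) maximizing Q[m]*dR[m,i,n,j] + Q[n]*dR[n,j,m,i].

    Q  : array of shape (L,), per-user weights.
    dR : array of shape (L, N, L, N); dR[m, i, n, j] is the expected rate
         increase to user m when its packet i is paired with packet j of user n.
    """
    L, N = dR.shape[0], dR.shape[1]
    best, best_val = None, -np.inf
    for m, n in itertools.combinations(range(L), 2):          # m < n
        for i, j in itertools.product(range(N), repeat=2):
            val = Q[m] * dR[m, i, n, j] + Q[n] * dR[n, j, m, i]
            if val > best_val:
                best, best_val = (m, i, n, j), val
    return best, best_val

# Toy example with L = 3 users and N = 2 buffered packets per user.
dR = np.zeros((3, 2, 3, 2))
dR[0, 1, 2, 0] = 5.0     # user 0, packet 1 paired with user 2, packet 0
dR[2, 0, 0, 1] = 4.0     # the same pairing seen from user 2
Q = np.ones(3)
print(best_pair(Q, dR))
```

The search visits L(L-1)/2 · N² candidates, which is manageable for the two-user sessions considered here but grows quickly for sessions involving more users.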

The above scheduling approach can be generalized to involve 
K-user R-round MAT sessions. However, the scheduling 
benefits are very limited due to the restrictive eavesdropper 
nature of the MAT-session. To see this consider a 3-user 2-round 
MAT-session scheduling scheme. In such a scheme, 
3-dimensional messages are transmitted from 3 antennas to 
3 users using 3 round-1 slots and 3 round-2 slots. The 
scheduler in this case would choose, for round-2, three-user 
MAT sessions between packets $i_1$, $i_2$, $i_3$ of users $m_1$, $m_2$ 
and $m_3$, respectively, based on eavesdropper CSIT from the 
round-1 transmission of these packets. In particular user $m_k$ 
gets three looks at its packet, one through its own channel and 
two more through the two eavesdropper channels (all at the 
same time). The set of these three channels must constitute a 
"good" $3 \times 3$ channel (in the sense that the expected rate of user 
k after the round-2 transmissions has to be sufficiently high). 
Furthermore, this has to simultaneously happen for all 3 users. 
As a result, the number of scheduling options required to get 
simultaneously good rates to all users grows exponentially fast 
with the number of users. 

Another limitation of MAT-session based scheduling is that 
the MAT session is completely determined by the completion 
of the second round, regardless of the total number of rounds. 
Hence, when scheduling MAT sessions with more than 2 
rounds, once the second round is completed the rest of the 
session has been fully determined and no further scheduling 
benefits are to be expected. 

7 The weight, Qm, of user m at a given scheduling slot, t, is provided to 
the scheduler and is simply the output of the virtual-queue process of user m 
at time t (5J. 
Fig. 1. Sub-message pairing into degree-2 messages for user pair (m, n). 
B. Eavesdropper-Based Packet-Centric Scheduling 

In this section we consider a different approach for scheduling 
MU-MIMO transmissions with outdated CSI. It is based 
on enabling packet-centric (rather than MAT-session based) 
interference alignment for efficient MU-MIMO transmission. 
This scheme exploits the same principles as the MAT scheme 
and achieves the same DoFs as the MAT scheme. In particular, 
consider R-round K-user protocols with a packet-centric 
scheme. The scheme has the following properties: 

• Much like the R-round K-user MAT scheme, for each 
round r with $1 \le r \le R - 1$, and for each degree-r 
message (i.e., a message simultaneously useful to 
r user terminals) that is transmitted, a set of K - r 
eavesdropper observations are communicated to each of 
the r intended receivers, by means of "network-coded" 
IA-enabling transmissions in the following rounds; 

• unlike the MAT scheme, however, the K - r eavesdropper 
observations are not preselected based on the MAT session; 
rather they are chosen based on the channel quality 
of the eavesdropper channels. 

To illustrate the difference between the two schemes consider 
first the problem of scheduling round-2 transmissions 
for a 2-round K-user MU-MIMO packet-centric scheme. This 
scheme relies on the use of a set of (m, n) user-terminal pairing 
queues of the form shown in Fig. 1, which generate degree-2 
messages for round-2 transmissions. The main principles 
behind packet-centric eavesdropper-based scheduling can be 
summarized as follows: 

1) Each round-1 slot involves transmitting a K-dimensional 
message intended for one of the L users. 

2) For each round-1 transmission intended for a given user, 
say user m, the base-station chooses K - 1 out of the 
L - 1 eavesdroppers for round-2 transmissions (based 
on round-1 eavesdropper CSIT). 

3) For each such eavesdropper, e.g. eavesdropper n, the 
base-station places the eavesdropped observation of user 
n in the corresponding (m, n) queue, in the queue input 
associated with eavesdropper n. 

4) Degree-2 messages for the user pair (m, n) are formed 
by combining sub-messages from the queues of eavesdroppers 
n and m within the (m, n) queue. These 
messages then simply wait for (round-2) transmission. 
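
Steps 3) and 4) can be sketched with a small queue structure. This is an illustrative data-structure sketch, not the authors' implementation: the class name `PairingQueues`, its methods, and the example eavesdropper choices are all hypothetical, and the channel-based selection of eavesdroppers in step 2) is taken as given.

```python
from collections import defaultdict, deque

class PairingQueues:
    """One pairing queue per unordered user pair, with one input per eavesdropper."""

    def __init__(self):
        # pair -> {eavesdropper: deque of (intended_user, packet_id)}
        self._queues = defaultdict(lambda: defaultdict(deque))
        self.ready = deque()   # degree-2 messages awaiting round-2 transmission

    def add_observation(self, intended, packet, eavesdropper):
        """Step 3: store user `eavesdropper`'s observation of `intended`'s packet."""
        q = self._queues[frozenset((intended, eavesdropper))]
        q[eavesdropper].append((intended, packet))
        # Step 4: combine with an observation made by `intended` of a packet
        # meant for `eavesdropper`, if one is already waiting.
        other_input = q[intended]
        if other_input and q[eavesdropper]:
            a = q[eavesdropper].popleft()
            b = other_input.popleft()
            self.ready.append(tuple(sorted((a, b))))

# Hypothetical eavesdropper choices: (intended user, packet id, eavesdropper).
queues = PairingQueues()
for intended, packet, eavesdropper in [
    (1, 6, 2), (1, 6, 3),      # packet 6 of user 1 overheard by users 2 and 3
    (2, 13, 1), (2, 13, 4),    # packet 13 of user 2 overheard by users 1 and 4
    (3, 27, 1), (3, 27, 5),    # packet 27 of user 3 overheard by users 1 and 5
    (4, 9, 2), (5, 4, 3),      # packets of users 4 and 5 overheard by users 2 and 3
]:
    queues.add_observation(intended, packet, eavesdropper)

for msg in queues.ready:
    print(msg)
```

Each printed entry pairs two (user, packet) sub-messages into one degree-2 message. Because pairing happens per observation rather than per session, a packet of user 2 can be paired with user 1 for one eavesdropper observation and with user 4 for the other.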

It is interesting to contrast eavesdropper scheduling with
MAT-session scheduling in the case K = 2. In this case, and
Fig. 2. MAT-session based vs. packet-centric scheduling: the 2-user case.
TABLE I

Sample 3-user 2-round MAT-Session Scheduling between
packets 6, 13, and 27 of users 1, 2, and 3, respectively

Messages for:   x_1(6)           x_2(13)           x_3(27)
Round 1         x_1(6)           x_2(13)           x_3(27)
Round 2         x_{1,2}(6,13)    x_{1,2}(6,13)     x_{1,3}(6,27)
                x_{1,3}(6,27)    x_{2,3}(13,27)    x_{2,3}(13,27)

TABLE II

Sample 3-user 2-round Packet-Centric Scheduling for packets
6, 13, and 27 of users 1, 2, and 3, respectively

Messages for:   x_1(6)           x_2(13)           x_3(27)
Round 1         x_1(6)           x_2(13)           x_3(27)
Round 2         x_{1,2}(6,13)    x_{1,2}(6,13)     x_{1,3}(6,27)
                x_{1,3}(6,27)    x_{2,4}(13,9)     x_{3,5}(27,4)

given CSIT (from all users) from the round-1 transmission
of message i for user m, the eavesdropper scheduler in step
2) above selects one eavesdropper out of all the users via



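$$n^*(m, i) = \arg\max_{n:\; 1 \le n \le L,\; n \ne m} R_{m,i}(n) \qquad (29)$$

where $R_{m,i}(n)$ is a heuristic objective function obtained by
replacing $\|h_m[t_n(j)]\|$ with $\|h_n[t_m(i)]\|$ in [...]

$$R_{m,i}(n) = \log\det\left[ I + H^{\mathsf{H}} H \, P/2 \right] \qquad (30)$$

with $H$ [...] involving the scaling factor $1/\sqrt{1 + \|h_n[t_m(i)]\|^2}$.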

Fig. 2 shows a performance comparison between the heuristic
packet-centric scheduler (29) and the MAT-based one (27),
assuming L = 20. As the figure illustrates, both schedulers
yield nearly identical performance. Heuristic approximations
of the form (29) can be readily used for implementing
packet-centric schedulers with K-user schemes, where K > 2.
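
The arg-max selection rule (29) can be sketched as follows. The objective used here is an assumed stand-in, not the paper's exact expression (30): it evaluates $\log_2\det(I + (P/2) H H^{\mathsf{H}})$ for a two-row matrix whose rows are the intended user's channel and the eavesdropper's channel scaled by $1/\sqrt{1 + \|h_n\|^2}$. The function names `inner`, `heuristic_rate`, and `select_eavesdropper`, and the dictionary layout of `channels`, are hypothetical.

```python
import math


def inner(u, v):
    """Hermitian inner product <u, v> = sum(conj(u_i) * v_i)."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def heuristic_rate(h_m, h_n, power):
    """Assumed stand-in objective: log2 det(I + (P/2) H H^H).

    H has two rows: h_m and h_n / sqrt(1 + ||h_n||^2). The 2x2
    determinant is expanded in closed form, so no linear-algebra
    library is needed.
    """
    scale = 1.0 / math.sqrt(1.0 + inner(h_n, h_n).real)
    g = [scale * x for x in h_n]
    c = power / 2.0
    norm_m_sq = inner(h_m, h_m).real
    norm_g_sq = inner(g, g).real
    cross_sq = abs(inner(h_m, g)) ** 2
    det = (1.0 + c * norm_m_sq) * (1.0 + c * norm_g_sq) - c * c * cross_sq
    return math.log2(det)


def select_eavesdropper(m, channels, power):
    """Pick the user n != m that maximizes the heuristic objective.

    `channels` maps each user index to its channel vector (a list of
    complex numbers) during the round-1 slot intended for user m.
    """
    candidates = [n for n in channels if n != m]
    return max(candidates,
               key=lambda n: heuristic_rate(channels[m], channels[n], power))
```

Under this stand-in objective, an eavesdropper whose channel is nearly collinear with that of the intended user scores poorly, because its observation adds little new information about the transmitted message.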

Note that the DoFs of 2-round K-user packet-centric sessions
are the same as the DoFs of the associated 2-round



8 Although this is a heuristic approximation, in principle the objective could
be validated by proper matching of eavesdropper observations at the combiner
of the (m, n) queue, such that eavesdropper channels of roughly equal norms
are combined to generate degree-two messages. As Fig. 2 suggests, however,
such careful combining is not necessary.
Fig. 3. Sub-message combining into degree-3 messages for users (m, n, q).

K-user MAT session. However, there is significantly more
flexibility in scheduling eavesdroppers. Tables I and II provide
examples of scheduling in a 2-round 3-user MAT session and
of scheduling in a 2-round 3-user packet-centric approach.
Both involve packet 6 of user 1, packet 13 of user 2, and
packet 27 of user 3. The i-th column in each table shows all
the transmitted messages associated with the packet of user
i. As Table I shows, all transmissions in the MAT-session
based scheme are determined by the scheduled MAT session
involving the three user-packets. In contrast, and as Table II
shows, in the packet-centric scheme, the packets of user 2
(and 3) are no longer restricted to be included in transmissions
involving packets of user 1 and 3 (1 and 2). Rather, the
eavesdroppers in each case are chosen independently, and it
is up to the pairing queue to group them into degree-two
messages.

The preceding two-round schemes can readily be extended
to develop R-round K-user packet-centric schemes. As an
example, consider the 3-round 3-user scheme. In this case,
round-2 scheduling uses pairing queues of the form of Fig. 1
and works as already described. Round-3 scheduling amounts
to scheduling eavesdroppers for degree-2 messages, i.e., messages
simultaneously useful to 2 users. Given eavesdropper
CSIT from round-2 transmissions intended for a particular pair
of users (m, n), an eavesdropper is selected, e.g. user q, out of
all $L - 2$ eavesdroppers. This eavesdropper's message enters
a round-3 pairing queue where it is used to create degree-3
messages for transmission. In particular, it is an input to the
user q eavesdropper queue of the (m, n, q) message queue,
shown in Fig. 3. As shown in the figure, degree-3 messages
(messages simultaneously useful to a triplet of users (m, n, q))
are constructed by combining three eavesdropped observations
of degree-two messages, one for each user eavesdropping on
a message intended for the pair of remaining users.

V. Simulation Results 

In this section we provide a brief performance evaluation
of the MU-MIMO schemes based on outdated CSI. We first
provide a comparison between the MAT scheme and a conventional
MU-MIMO scheme employing LZFB in the context of
training and feedback over time-varying channels. We assume
a block fading channel model that evolves according to a
Fig. 4. Comparison between MAT and conventional LZFB for a system with
2 antennas at the BS serving two single-antenna users.

Gauss-Markov process. The effect of the delay between the
time $t_{pl}$ of downlink pilots (slots in which CSIT is estimated)
and the time $t_{pl} + t_\delta$ of MU-MIMO transmission is captured
by the magnitude of the expected correlation between channels
from such a pair of time slots. This coefficient is defined by

$$\rho(t_\delta) = \frac{\left| E_t\left[ h_k^{\mathsf{H}}[t] \, h_k[t + t_\delta] \right] \right|}{E_t\left[ \|h_k[t]\|^2 \right]}$$

A value $\rho = 1$ means that the channels are perfectly correlated
(co-linear), which happens if the CSI acquisition and data
transmission occur in the same coherence block, whereas
$\rho < 1$ indicates that the channels changed between slots.
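
As a numerical illustration of this coefficient, the sketch below simulates a scalar channel coefficient and estimates the correlation empirically. It rests on assumptions that the text does not state: a first-order Gauss-Markov recursion $h[t+1] = a\,h[t] + \sqrt{1 - a^2}\,w[t]$ with unit-variance circularly symmetric complex Gaussian innovations, a real parameter $a \in [0, 1]$, and a single scalar coefficient rather than a channel vector. Under this model the correlation at lag $t_\delta$ equals $a^{t_\delta}$. The function names are hypothetical.

```python
import math
import random


def simulate_gauss_markov(a, num_steps, rng):
    """Simulate h[t+1] = a*h[t] + sqrt(1 - a^2)*w[t] with w ~ CN(0, 1)."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")

    def complex_gaussian():
        sigma = math.sqrt(0.5)  # real and imaginary parts each have variance 1/2
        return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

    h = [complex_gaussian()]  # start from the stationary distribution
    innovation_scale = math.sqrt(1.0 - a * a)
    for _ in range(num_steps - 1):
        h.append(a * h[-1] + innovation_scale * complex_gaussian())
    return h


def empirical_correlation(h, lag):
    """Estimate |E[conj(h[t]) * h[t + lag]]| / E[|h[t]|^2] by time averaging."""
    n = len(h) - lag
    numerator = sum(h[t].conjugate() * h[t + lag] for t in range(n)) / n
    denominator = sum(abs(h[t]) ** 2 for t in range(n)) / n
    return abs(numerator) / denominator
```

With `a = 0.9`, for instance, the estimate at lag 2 should be close to $0.9^2 = 0.81$ for a long enough sample path, while the estimate at lag 0 is 1 by construction.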

Fig. 4 shows a comparison between the conventional LZFB
and the MAT scheme for the case of M = K = 2 in
the presence of training, and assuming $\beta_1 = \beta_f = 2$ and
$P = P_1$. When $\rho = 1$, LZFB achieves 2 DoFs, as expected
[7]. There is a constant rate gap between the perfect CSI case
and the case of imperfect CSI based on training, as justified
in [7]. In contrast, for all cases with $\rho < 1$, even with a very
high correlation of $\rho = 0.99$, the achievable rates eventually
saturate as SNR increases [7]. Furthermore, decreasing $\rho$
below 0.99 results in significantly lower saturation rates.

This rate saturation is not seen with the MAT schemes,
which achieve DoF = 4/3 independent of $\rho$. Using downlink
training and CSI feedback degrades the achievable rate; however,
it does so by a constant gap regardless of the value of $\rho$,
similar to what was observed in [7] for LZFB when $\rho = 1$.
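
For reference, the two-user MAT scheme attains 4/3 DoFs. The short sketch below evaluates the general K-user expression $K / (1 + 1/2 + \dots + 1/K)$, which is quoted from the Maddah-Ali and Tse result for a base station with at least K antennas rather than derived in this section. The function name `mat_dof` is hypothetical.

```python
from fractions import Fraction


def mat_dof(num_users):
    """DoFs of the K-user MAT scheme: K / (1 + 1/2 + ... + 1/K)."""
    if num_users < 1:
        raise ValueError("num_users must be at least 1")
    harmonic = sum(Fraction(1, k) for k in range(1, num_users + 1))
    return Fraction(num_users) / harmonic


# mat_dof(2) evaluates to Fraction(4, 3); mat_dof(3) to Fraction(18, 11).
```

Exact rational arithmetic is used so that the values can be compared directly with the slopes of the rate curves at high SNR.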

Fig. 5 shows some of the benefits of packet-centric scheduling
as a function of the number of users served by the MU-MIMO
scheme and the number of rounds used for transmission.
In particular, the figure shows a performance comparison
of the packet-centric scheduler for K = 3 users
with two and three rounds of transmission, as well as the
scheduler's performance for the K = 2 user case. Also shown
in the figure is the performance of the associated MAT-session
based schemes without scheduling. As the figure suggests, the
packet-centric scheduler achieves the DoFs promised by the
associated MAT scheme. In addition, packet-centric scheduling
offers more flexibility when scheduling users, enabling performance
benefits to be realized with a 3-round 3-user scheme
at lower SNRs.
Fig. 5. Performance of packet-centric scheduling in the case of two-user and
three-user packet-centric MU-MIMO schemes based on outdated CSI.

VI. Conclusion 

In this paper we considered training and scheduling aspects
of multi-round MU-MIMO schemes that rely on the use of
outdated channel state information (CSI) [8]. Such schemes
are of practical interest as they enable one to rethink many
operational aspects of deploying MU-MIMO in dynamic channel
conditions, conditions which can inherently limit conventional
MU-MIMO approaches. As shown in the paper, under proper
training, the degrees of freedom promised by these schemes
can be realized even with fully outdated CSI. We also proposed
a novel scheduling algorithm that improves the performance
of the original MAT scheme [8]. It is based on a variant
multi-user MIMO scheme which maintains the MAT scheme
DoFs but provides more scheduling flexibility. As our results
suggest, an appropriately designed MU-MIMO scheme based
on multi-round transmissions and outdated CSI can be a
promising technology for certain practical applications.